-
-
-
-
- CRUSH and UNCRUSH documentation
-
- Versions
- CRUSH : 0.81
- UNCRUSH : 0.61
-
- First Public Release
- Designed and created by Bill Davidson
-
-
-
-
- DISCLAIMER :
- This is free. You may distribute this program as long as both the source
- code and the compiled CRUSH and UNCRUSH and this documentation are
- included. This software is basically just for learning purposes. You
- may edit it but please include the original files with your own. Most
- importantly, DISTRIBUTE! The author of these programs is not responsible
- for any damage to anything or anyone as a result of using this program.
- In other words, leave me out!
-
- IMPORTANT :
-
- In order for this program to run, the desired file to be compressed must
- be called "q.q". The reason for that is I really didn't want to mess with
- accepting parameters from the command line with the limited time I had to
- create this program (This was originally made for the 1994 science fairs.)
- The output file is called "w.w". When uncompressing, the input file must be
- called "w.w" and the output file is called "e.e". I chose these unique file
- names, because I had nothing better to call them.
- Also, this compression program only works on text files. The reason is as
- follows: my compression program doesn't open files in binary mode, and it
- looks for an end-of-file marker. In compiled files, such as executables and
- spreadsheet files, EOF markers are everywhere, so it ends up compressing
- 512 bytes of a 100 K file. You can try it if you like. All it works on is
- ASCII text. The text can be extended ASCII, but it can't be a compiled
- file.
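-
- The end-of-file problem can be shown with a short Python sketch. It is only
- an illustration, not part of CRUSH. It assumes the marker in question is the
- Ctrl-Z byte (decimal 26) that DOS text mode treats as end of file; the
- function name and the sample bytes are made up for the example.

```python
EOF_MARK = 0x1A  # Ctrl-Z; assumed here to be the end-of-file marker in question


def text_mode_view(data: bytes) -> bytes:
    """Return the bytes a reader sees if it stops at the first EOF marker."""
    cut = data.find(bytes([EOF_MARK]))
    return data if cut == -1 else data[:cut]


# A made-up "compiled" file: a marker byte sits only a few bytes in.
sample = b"MZ\x90\x00\x1a" + b"\x00" * 100
print(len(text_mode_view(sample)), "of", len(sample), "bytes would be read")
```

- Plain text normally has no such byte before its real end, so all of it is
- read.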
- For my science fair, I tested this program against 9 other commercial
- data compression programs: AR002, ARJ241, HPACK, LARC333, LZHUFT, PAK251,
- PKZIP204g, SQZ1083, and ZOO210. My study and the results of it will be
- released at a later date (hopefully in a month.)
-
-
- How it Works
-
- The compression engine used in CRUSH is ASCII text replacement. It uses a
- preset dictionary that contains the strings it looks for. The strings were
- derived from James A. Storer's book, DATA COMPRESSION : METHODS AND THEORY.
-
- The compression program reads a line from the input file and turns it into
- a string. Then it searches the line for header characters. If it finds one,
- it doubles it. This is important because my decompression program looks
- for header characters to decode. If there is a header character in the
- original file and my compression program doesn't do anything with it, then
- my decompression program will try to uncode something that isn't supposed
- to be uncoded. Once that is done, starting with the 8 letter arrays and
- working down to the 3 letter arrays, it searches the modified line-string
- for the strings in the arrays. If it finds one, it deletes the string and
- puts in its place a header character (there is one for each of the letter
- arrays) and a code character. The code character is assembled by taking the
- position of the replacement string in the array where it was found, adding
- 145, and taking the corresponding ASCII character. Adding 145 keeps the code
- characters out of the 0 to 31 ASCII set that DOS uses and puts them before
- my header characters. After it is done with that coding, it moves past the
- replaced string and on down the line-string. When it reaches the end of the
- line, it writes that line to the output file and reads the next line of the
- input file. Once it reaches the end-of-file marker, it saves the output file
- and quits.
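-
- The steps above can be sketched in Python as follows. This is only an
- illustration, not the actual CRUSH source. The string tables and the header
- characters (250 to 255) are made up for the example; the real ones are in
- the source code. The sketch also uses a whole-line replace for each string
- instead of walking down the line, which is a simplification.

```python
OFFSET = 145  # added to a string's position to get its code character

# Hypothetical tables, keyed by string length. The real tables are in the
# CRUSH source; these few words are only for illustration.
TABLES = {
    8: ["together", "children"],
    7: ["between"],
    6: ["should"],
    5: ["which", "there"],
    4: ["that", "with", "this"],
    3: ["the", "and", "ing"],
}

# Hypothetical header characters, one per string length (3 -> 250 ... 8 -> 255),
# chosen so that every code character comes before them.
HEADERS = {length: chr(247 + length) for length in range(3, 9)}
HEADER_SET = set(HEADERS.values())


def crush_line(line: str) -> str:
    """Code one line: double literal headers, then replace table strings."""
    # Step 1: double any header character already present in the text.
    out = "".join(ch * 2 if ch in HEADER_SET else ch for ch in line)
    # Step 2: replace table strings, longest strings first.
    for length in range(8, 2, -1):
        header = HEADERS[length]
        for position, word in enumerate(TABLES[length]):
            out = out.replace(word, header + chr(OFFSET + position))
    return out


line = "together with the children"
coded = crush_line(line)
print(len(line), "characters in,", len(coded), "characters out")
```

- Going from the longest strings to the shortest matters here: "together"
- contains "the", so it has to be replaced as a whole before the 3 letter
- table gets a chance at it.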
-
- Uncoding is quite simple. It reads a coded line and turns it into a string.
- It then searches for the header characters. If it finds one, it checks the
- character after it. If that is the same character as the header character,
- it deletes one of them and moves on. If it isn't, it gets the ordinal value
- of that character and subtracts 145, the offset number. Then it finds the
- corresponding string with that number, deletes the header character and
- the code character, and inserts the string from the array. Then it moves
- down the line. Once it reaches the end of the line, it writes that
- line-string to the output file. Once it reaches the end of the file, it
- saves the output file and quits.
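-
- The uncoding steps can be sketched in Python in the same way. Again, this
- is only an illustration, not the actual UNCRUSH source. The string tables
- and the header characters (250 to 255) are made up for the example, and the
- sketch assumes the input line was coded with those same tables.

```python
OFFSET = 145  # subtracted from a code character to get the string's position

# Hypothetical tables, keyed by string length (not the real UNCRUSH tables).
TABLES = {
    8: ["together", "children"],
    7: ["between"],
    6: ["should"],
    5: ["which", "there"],
    4: ["that", "with", "this"],
    3: ["the", "and", "ing"],
}

# Hypothetical header characters mapped back to the string length they stand
# for (250 -> 3 ... 255 -> 8).
LENGTH_OF_HEADER = {chr(247 + length): length for length in range(3, 9)}


def uncrush_line(coded: str) -> str:
    """Uncode one line by expanding each header/code pair."""
    out = []
    i = 0
    while i < len(coded):
        ch = coded[i]
        if ch in LENGTH_OF_HEADER and i + 1 < len(coded):
            following = coded[i + 1]
            if following == ch:
                # Doubled header: it was a literal character in the original.
                out.append(ch)
            else:
                # Header plus code character: look the string up in its table.
                position = ord(following) - OFFSET
                out.append(TABLES[LENGTH_OF_HEADER[ch]][position])
            i += 2  # move past both characters
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(uncrush_line("\xff\x91 \xfb\x92 \xfa\x91 \xff\x92"))
```

- The sample line holds four header/code pairs separated by spaces, and it
- uncodes back to ordinary words from the made-up tables.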
-
-
- If you want to get in touch with me you can
-
- write : Bill Davidson
- 1811 S. 73rd Circle
- Fort Smith, AR 72903
-
- call : (501)452-7043
-
- send : Write mail to all in the Data Compression forum on
- Compuserve, or send a private letter to my dad on
- his account number : 73361,1217
-
-
- or call any of these BBS's that I call frequently in Fort Smith and
- leave me a message :
-
- Paradox of Arkansas 484-0944
- 484-1043
- AR/OK PC Users Group 646-0543
- The Serial Connection 785-2408
- 785-2477
- Realms of Thunder 484-0884
-
-
- These numbers are valid as of June, 1994
-
-
- If you do decide to modify this file, please notify me through one of these
- sources. Contact me if you have any info on data compression program
- writing. I am currently looking for an idea for next year revolving around
- this, so if you have an idea, tell me.
- Please do not criticize me for bad code writing or such. I have only been
- programming for 2 years and am leaving Pascal for C++ and Assembly...
-
- Have fun. Hope it sparks some ideas for you!
-
- Bill Davidson
-
-
-